ABSTRACT

We present a general analytical formalism to calculate accurately several statistics
related to underdense regions in the Universe. The statistics are computed for dark
matter halo and galaxy distributions, both in real space and in redshift space, at any
redshift. Using this formalism, we found that void statistics for galaxy distributions
can be obtained, to a very good approximation, by assuming that galaxies have the same
clustering properties as halos above a certain mass. We deduced a relationship
between this mass and that of halos with the same cumulative number density as the
galaxies.

We also found that the dependence of void statistics on redshift is small. For
instance, the number of voids larger than $13\,h^{-1}$ Mpc (defined so as not to contain
galaxies brighter than $M_r = -20.4 + 5\log h$) changes by less than 20% between $z = 1$
and $z = 0$. However, the dependence of void statistics on $\sigma_8$ and $\Omega_m h$ is
considerably larger, which makes them suitable for tests that measure these parameters.
We have shown how to construct several of these tests efficiently and discussed in detail
the treatment of several observational effects. The formalism presented here, together
with the observed statistics extracted from current and future large galaxy redshift
surveys, will provide an independent measurement of the relevant cosmological parameters.
Combining these measurements with those found using other methods will help to reduce
their uncertainties.

Key words: cosmology: theory — cosmological parameters — dark matter — large- 
scale structure of universe — galaxies: statistics — methods: analytical 



1 INTRODUCTION 

Large underdense regions in the Universe, commonly known
as voids, are a prominent signature of the large-scale
structure of the Universe (LSS). Only recently have they
drawn much attention from both the observational and the
theoretical sides. However, it is becoming increasingly evident
that voids play an important role as cosmological probes and in
understanding the processes involved in galaxy formation
(Peebles 2001; Plionis & Basilakos 2002; Hoyle et al. 2005;
Conroy et al. 2005; Patiri et al. 2006a,b; Tinker, Weinberg
& Warren 2006; Hoeft et al. 2006; Park & Lee 2007; Croton
& Farrar 2008; Tinker et al. 2008).

* E-mail: jbetanco@iac.es
† E-mail: spatiri@case.edu

Voids occupy large areas projected on the sky, and
only recently have galaxy redshift surveys become large
enough to allow systematic and robust studies of void
statistics (see e.g. Croton et al. 2004; Hoyle & Vogeley 2004; Patiri
et al. 2006; Ceccarelli et al. 2006 for analyses of voids in the
2dF, and Tinker et al. 2008 in the SDSS). On the theoretical
side, important progress has been made recently in both the
numerical and the analytical modelling of voids
(Mathis & White 2002; Gottlober et al. 2003; Colberg et
al. 2005 for numerical simulations; Sheth & Van de Weygaert
2004; Patiri, Betancort-Rijo & Prada 2006, hereafter
PBP06, for analytical approaches).

Most of the recent efforts concerning the statistics of
voids have focused on constraining different aspects
of the formation and evolution of structure in the Universe
rather than on assessing their ability to constrain cosmological
parameters. Void statistics can, in principle, be used
to constrain any cosmological parameter. However,
these statistics may be especially sensitive to some of them.
For instance, the abundance of voids larger than a given
radius depends on the normalization of the linear spectrum
of mass fluctuations (denoted $\sigma_8$). Furthermore, the
abundance of voids as a function of their radius depends on the
shape of the spectrum (usually denoted $\Gamma = \Omega_m h$).
Although these dependencies arise naturally from the current
picture of structure formation, it has been
difficult to demonstrate them (see Little & Weinberg 1994; Tinker,
Weinberg & Warren 2006). We argue that this might be due
to the specific way in which galaxies are distributed within
dark matter halos. In previous works, the Halo Occupation
Distribution (HOD) and its variants have been used to model
the distribution of galaxies within halos. In these methods,
the models are fitted (for a given $\sigma_8$, for instance) to
reproduce the projected two-point correlation function, which
yields very similar void statistics and hence makes the void
statistics insensitive to $\sigma_8$ (e.g. Tinker, Weinberg
& Warren 2006). The limited size of the simulation boxes
used in previous works could also be an issue. Note that the
void statistic most widely used to test these assumptions
has been the so-called Void Probability Function (VPF).
However, it has large sampling errors and may not be the
most efficient statistic; other statistics, such as the number
of voids larger than a given radius, are more appropriate
(Plionis & Basilakos 2002; PBP06).

Computing accurate predictions of void statistics for
different cosmological parameters is not a simple task. The
options are either to run a large suite of numerical simulations,
for different sets of cosmological parameters, with enough
volume to obtain reliable void statistics, or to develop an
analytical framework that predicts the dependence of void
statistics on cosmological parameters. As the former requires
a large amount of computing power to achieve the needed
accuracy, in this paper we focus on the latter approach.
In PBP06 we developed an analytical framework to obtain
the number density of voids larger than a given radius
defined by dark matter halos in a $\Lambda$CDM cosmology.
Building on that work, we present here an extension of the
formalism that allows model predictions to be compared with
observational void statistics in a homogeneous way and,
consequently, cosmological information to be extracted. The
extension has two aspects. First, the relationship between
the VPF and the number density of voids, which was determined
in PBP06 only in the limit of the largest voids (for a given
number density of dark matter halos), is extended here to
deal also with smaller voids. In addition, the analytical
expression used to compute the VPF of dark matter halos has been
improved. Second, we have modified our formalism so that we
can now compute directly the predictions for the number
of voids larger than a given radius observed within a given
volume (i.e. taking into account the redshift distortions and
the selection effects of a given survey).



Currently, there are several methods and experiments
devoted to constraining cosmological parameters. CMB
experiments such as WMAP (Hinshaw et al. 2008; Komatsu et
al. 2008), in combination with galaxy clustering information
(e.g. Sanchez et al. 2006; Seljak et al. 2006), have provided
the best constraints to date. Methods based on clusters of
galaxies and lensing are also promising (e.g. Bahcall et al.
2003; Wang et al. 2003; Yoo et al. 2006; Vikhlinin et al.
2008). However, as all methods have limitations and even
degeneracies in some parameters, it is essential to provide
independent parameter estimates in order to improve the
accuracy of the combined measurements. As mentioned above,
voids might be especially useful for measuring some
cosmological parameters, such as the normalization of the
amplitude of density fluctuations, since they trace
comparable scales.

The steps needed to use the statistics of voids to
constrain cosmological parameters are presented as follows.
In section 2 we present the new relationship between the
number density of voids [$n_v(r)$] and the
VPF [denoted $P_0(r)$]. In section 3 we show how to
obtain the VPF and other statistics for galaxy distributions,
discussing the relationship between dark matter halos and
galaxies. In section 4 we extend our formalism in order to
compute void statistics in redshift space. In section 5 we
discuss the dependence of void statistics on redshift and
cosmological parameters ($\sigma_8$ and $\Gamma$ in particular).
In section 6 we show how to handle in the theoretical calculations
several observational effects that perturb the statistics of
voids, namely the effect of the finite sample, the
spectroscopic completeness and the variation of the void statistics
with redshift (i.e. the snapshot effect). In section 7 we
propose different tests for measuring cosmological parameters
using void statistics. In section 8 we present a comparison
of results obtained using our formalism with those found in
numerical simulations (Millennium Run). Finally, in section
9 we present the discussion and conclusions.



2 RELATIONSHIP BETWEEN THE VPF AND 
THE NUMBER DENSITY OF VOIDS 

The most common, widely used void statistic is the so-called 
Void Probability Function (VPF, White 1979), which is the 
probability that a randomly placed sphere with radius r is 
empty of objects (galaxies or dark matter halos). Another 
important statistic is the number density of voids (defined as 
maximal non-overlapping spheres) larger than a given radius 
r. Several works have established the relationship between 
these two statistics (see e.g. Otto et al. 1986; Betancort-Rijo 
1990; PBP06). However, they are only valid for rare voids, 
i.e. the largest voids in a given sample. 
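
As an illustration of the VPF definition above, the following minimal sketch estimates $P_0(r)$ by dropping random spheres on a point set. It assumes an unclustered (Poisson) set of points in a periodic unit box, which is not the halo or galaxy distribution studied here; for such a set the expected value is $\exp(-\bar{n}V)$, which gives a simple cross-check. The function name `vpf_monte_carlo` and all numerical values are illustrative choices.

```python
import math
import random


def vpf_monte_carlo(points, radius, n_trials, rng):
    """Fraction of randomly placed spheres (periodic unit box) containing no point."""
    r2 = radius * radius
    empty = 0
    for _ in range(n_trials):
        centre = (rng.random(), rng.random(), rng.random())
        for point in points:
            d2 = 0.0
            for c, p in zip(centre, point):
                d = abs(c - p)
                d = min(d, 1.0 - d)  # minimum-image separation, box of side 1
                d2 += d * d
            if d2 < r2:
                break  # the sphere is occupied
        else:
            empty += 1  # no point fell inside the sphere
    return empty / n_trials


rng = random.Random(12345)
n_points = 200  # illustrative number of points in the unit box
points = [(rng.random(), rng.random(), rng.random()) for _ in range(n_points)]
radius = 0.1
volume = 4.0 / 3.0 * math.pi * radius ** 3

p0_estimate = vpf_monte_carlo(points, radius, 2000, rng)
p0_poisson = math.exp(-n_points * volume)  # expectation for unclustered points
print(p0_estimate, p0_poisson)
```

For clustered tracers the estimate departs from the Poisson value, and that departure is what the formalism below is designed to predict.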

In PBP06 we argued that most of the information carried
by void statistics concerning cosmological parameters
comes from rare voids. However, as we will see below, more
common voids are still relevant because they increase the
statistics, which is fundamental to constrain cosmological
parameters reliably. In order to take more common voids into
account, the analytical relationships mentioned above have to be
improved.

In PBP06 we showed that the VPF [which we denote here
$P_0(r)$] is related to the number density of voids larger than
a radius $r$ [$n_v(r)$] by the following expression (in the rare
void limit):



$$ n_v(r) = \frac{3\pi^2}{32}\,\frac{(\bar{n}'V)^3}{V}\,P_0(r) \qquad (1) $$

where

$$ \bar{n}'V = -\frac{1}{3}\,\frac{d\ln P_0(r)}{d\ln r}; \qquad V = \frac{4}{3}\pi r^3 \qquad (2) $$

and $\bar{n}'$ is the mean density of points on the surface of a
randomly chosen empty sphere of radius $r$, which we express
in terms of the derivative of $P_0(r)$ with respect to $r$.

Equation (1) gives a unique functional relationship
between $P_0(r)$ and $n_v(r)$. As mentioned above, this works well
for rare voids, but for more common voids the existence
of a unique functional relationship has to be studied. Note
that the VPF is determined by the whole hierarchy of
correlation functions (White 1979), but the VPF itself does not
determine all these correlations uniquely. For instance, two
samples could have the same VPF and still differ in some
aspect of the clustering, which would yield different number
densities of voids larger than a given radius. To address this
issue we carried out a detailed analysis using the Millennium
Run numerical simulation (Springel et al. 2005; see section
8). We found evidence for a unique functional relationship
between $n_v(r)$ and the VPF for the distributions relevant to
this work. We write this expression as an extension of the
relationship given in equation (1), i.e.



$$ n_v(r) \simeq \frac{0.68\,K(r)}{V}\,e^{-3.5K(r)[1-2.18K(r)]} \qquad (3) $$

and

$$ K(r) = \left(-\frac{1}{3}\,\frac{d\ln P_0(r)}{d\ln r}\right)^3 P_0(r); \qquad V = \frac{4}{3}\pi r^3 \qquad (4) $$

These equations are valid for $K(r) \le 0.46$, while for $K(r) > 0.46$,
$n_v(r) = 0.313/V$. The quantity $K(r)$ measures the
rareness of the voids. In the rare void limit $K(r)$ goes to
zero, so the exponential factor is close to one and the form of
the original equation (1) is recovered. Note also that the
coefficient $3\pi^2/32$ in equation (1) is replaced by 0.68 in
equation (3). The former coefficient was originally introduced by
Preskill & Politzer (1986), but we found the latter to be more
accurate (Betancort-Rijo, in preparation).
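
Equations (3) and (4), together with the constant branch for $K(r) > 0.46$, can be sketched as a small function. This is a minimal illustration rather than the authors' code: the function names are ours, and the caller is assumed to supply $P_0(r)$ and its logarithmic derivative. The Poisson example at the end ($P_0 = e^{-\bar{n}V}$, so $d\ln P_0/d\ln r = -3\bar{n}V$) is used only because it has a closed form.

```python
import math

K_MAX = 0.46  # above this value the text gives n_v = 0.313 / V


def sphere_volume(r):
    return 4.0 / 3.0 * math.pi * r ** 3


def rareness_K(p0, dlnp0_dlnr):
    """Equation (4): K = (-(1/3) dlnP0/dlnr)^3 * P0."""
    return (-dlnp0_dlnr / 3.0) ** 3 * p0


def exponential_factor(K):
    """Exponential factor of equation (3)."""
    return math.exp(-3.5 * K * (1.0 - 2.18 * K))


def void_number_density(r, p0, dlnp0_dlnr):
    """Equation (3) for K <= 0.46, constant branch 0.313/V otherwise."""
    K = rareness_K(p0, dlnp0_dlnr)
    V = sphere_volume(r)
    if K > K_MAX:
        return 0.313 / V
    return 0.68 * K / V * exponential_factor(K)


# Poisson example (illustrative only): P0 = exp(-nbar V), dlnP0/dlnr = -3 nbar V.
nbar, r = 5.0e-3, 8.0
nV = nbar * sphere_volume(r)
n_v = void_number_density(r, math.exp(-nV), -3.0 * nV)
print(n_v)
```

With these definitions the exponential factor is 1 at $K = 0$, about 0.67 at $K = 0.23$, and the two branches agree to better than one per cent at $K = 0.46$.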

In the limit in which $K(r)$ goes to zero, all voids larger
than $r$ are only slightly larger than $r$, so that their mean
volume, $\bar{V}(r)$, is only slightly larger than $V(r)$. Thus, using
equation (3), we have for the fraction of volume occupied by
voids, $F(r)$:

$$ F(r) = n_v(r)\bar{V}(r) > n_v(r)V(r) \simeq 0.68\,K(r) \qquad (5) $$



For $K(r)$ less than 0.1, $F(r)$ is smaller than 0.07 and so the
voids can be considered rare. For these voids the exponential
factor in equation (3) differs by less than 10% from 1, so that the
asymptotic expression [eq. (1) with the adequate coefficient]
is a good approximation. As $K(r)$ increases, the exponential
factor decreases, reaching a minimum at $K(r) \simeq 0.23$,
where it takes a value close to 2/3. For
$K(r) > 0.23$, the exponential factor increases again,
recovering the value of 1 at $K(r) \simeq 0.46$. Note that this
corresponds to rather common voids occupying more than one
third of the volume of the sample. Further, for very common
voids $n_v(r)$ is practically independent of $K(r)$. In this limit
the problem degenerates into one of random packing
of spheres of unequal radius (see e.g. Shi & Zhang 2008),
which is far beyond the scope of this work. It is worth
mentioning, however, that equation (3) provides a reasonably
good approximation even in this limit, although it has been
carefully checked only for $K \lesssim 0.42$.

To compute $P_0(r)$ we start from a more general statistic,
$P_n(r)$. This denotes the probability that a sphere of
radius $r$, placed at random within the distribution, contains
$n$ objects (Layzer 1954). Note that for small values of $n$,
$P_n(r)$ still characterizes underdense regions. As we will
see, this statistic is very important for our purposes. For
point distributions conforming to a random, non-uniform
Poissonian process (Peebles 1980) we have

$$ P_n(r) = \int_0^\infty P(u)\,\frac{u^n}{n!}\,e^{-u}\,du \qquad (6) $$



where P(u) is the probability distribution for the integral of 
the probability density, u, within a randomly placed sphere. 
In PBP06 we showed that for dark matter halos, u can be 
written as: 



$$ u = [\bar{n}V(1+\delta)]\,[1+\delta_{ns}] \qquad (7) $$

where $\bar{n}$ denotes the mean number density of the halos in
the sample (usually halos above some given mass), $V$
is the volume of the sphere, and $\delta$ is the actual enclosed
density contrast within the sphere. The first term on the right-hand
side of the equation is the integral of the probability density
within the sphere for halos tracing the mass (i.e. no bias,
which is true in the very low mass limit). In general, halos
are biased tracers of the underlying mass distribution, owing
to the initial clustering of the proto-halos before they move
along with the mass (i.e. the statistical clustering). The second
term of the equation accounts for this biasing. In PBP06
we obtained an approximation for this bias as a function of
the linear enclosed density contrast within the sphere ($\delta_l$),
written as:

$$ 1 + \delta_{ns}(\delta_l) = A(m)\,e^{-b(m)\delta_l^2} \qquad \forall\ \delta_l < -1 \qquad (8) $$



where $A(m)$ and $b(m)$ are coefficients depending mainly on
the halo mass and, to a lesser extent, on the size of the
sphere (see Rubino-Martin et al. 2008). We use the complete
Zeldovich approximation (CZA, Betancort-Rijo & Lopez-Corredoira
2001) to obtain the actual enclosed density contrast,
$\delta$, as a function of the eigenvalues of the linear
deformation tensor, $\lambda_i$, and since $\delta_l = \sum_i \lambda_i$,
we can write $u$ as a function of the $\lambda_i$ values.
Equation (6) can then be written as:

$$ P_n(r) = \int\!\!\int\!\!\int P(\lambda_1,\lambda_2,\lambda_3,r)\,\frac{[u(\lambda_i)]^n}{n!}\,e^{-u(\lambda_i)}\,d\lambda_1\,d\lambda_2\,d\lambda_3 \qquad (9) $$

where $P(\lambda_i,r)$ is the probability distribution for the $\lambda_i$
within a sphere of radius $r$ chosen at random in Eulerian
space (more precisely, the $\lambda_i$ values on the Lagrangian
patches that transform into that sphere). Betancort-Rijo &
Lopez-Corredoira (2002) showed that $P(\lambda_i,r)$ can be
derived from the probability distribution for the $\lambda_i$ within a
sphere of constant Lagrangian radius $Q$ chosen at random
in Lagrangian space (Doroshkevich 1970). Note that this
last equation involves a triple integral. However, in the
cases we are interested in [i.e. where $n$ is considerably smaller
than the mean, $\bar{n}V$], it can be approximated without
loss of accuracy by an equation involving only one integral:

$$ P_n(r) = \int P(\delta_l,r)\,\frac{[u(\delta_l)]^n}{n!}\,e^{-u(\delta_l)}\,d\delta_l \qquad (10) $$

where $u(\delta_l)$ is now a function of $\delta_l$ (and implicitly of $r$)
through the dependence of the actual density contrast, $\delta$,
on its linear counterpart, $\delta_l$. We write this functional
dependence as

$$ u(\delta_l) = \left(\bar{n}V[1+\delta(\delta_l,r)]\right)[1+\delta_{ns}(\delta_l)]. \qquad (11) $$

$\delta(\delta_l,r)$ is basically the relationship between the actual and
the linear density contrast within a sphere as given by the
standard spherical collapse model, except for a small
correcting term depending on $r$ (see eq. A3 in Appendix A).
Note that in PBP06 we did not use this correcting term,
so the values of $P_n(r)$ obtained there have a small but
non-negligible error. $P(\delta_l,r)$ is the probability distribution for
the linear density contrast within a sphere chosen at random
in Eulerian space. See Appendix A for details on how
to evaluate equation (10) efficiently.
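
The structure of equations (8), (10) and (11) can be sketched with a simple quadrature. The sketch below rests on three stand-ins that are not taken from this paper: a Gaussian $P(\delta_l)$ of rms `sigma` (the Eulerian distribution of Appendix A is not reproduced here), the commonly used spherical-collapse fit $1+\delta = (1-\delta_l/1.5)^{-1.5}$ in place of equation A3, and hypothetical values of $A$ and $b$. It is meant to show how the pieces fit together, not to reproduce the published numbers.

```python
import math


def actual_density_factor(delta_l):
    """1 + delta from a common spherical-collapse fit (assumption, not eq. A3)."""
    return (1.0 - delta_l / 1.5) ** -1.5


def bias_factor(delta_l, A, b):
    """Equation (8): 1 + delta_ns = A * exp(-b * delta_l^2)."""
    return A * math.exp(-b * delta_l ** 2)


def u_of_delta_l(delta_l, nbar_V, A, b):
    """Equation (11): u = nbar V (1 + delta) (1 + delta_ns)."""
    return nbar_V * actual_density_factor(delta_l) * bias_factor(delta_l, A, b)


def p_n(n, nbar_V, sigma, A=1.0, b=0.0, steps=4000):
    """Equation (10) by trapezoidal quadrature, assuming a Gaussian P(delta_l)."""
    lo = -8.0 * sigma
    hi = min(8.0 * sigma, 1.49)  # stay below the divergence of the fit at 1.5
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        d = lo + i * h
        gauss = math.exp(-0.5 * (d / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        u = u_of_delta_l(d, nbar_V, A, b)
        # Poisson kernel u^n e^{-u} / n!, evaluated in logarithms for stability.
        kernel = math.exp(n * math.log(u) - u - math.lgamma(n + 1))
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gauss * kernel * h
    return total


# Hypothetical inputs: nbar V = 5, sigma = 1, A = 1.2, b = 0.05.
print(p_n(0, 5.0, 1.0, A=1.2, b=0.05))
```

In the limit of very small `sigma` and no bias the result tends to the Poisson value $e^{-\bar{n}V}$; the truncation of the upper limit only discards overdense regions, which contribute negligibly to $P_n$ for small $n$.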

In PBP06 we discussed whether halo clustering conforms
exactly to a random Poissonian model. In this model, objects
are placed according to an underlying
probability density field, but independently of the actual
positions of the points already placed. In fact, halos have an
exclusion region around them that cannot be accounted for within
that model. However, we do not detect any deviation from the
Poissonian model. This is probably because, at
least in these underdense environments, the mean distance
between halos is much larger than the exclusion
region (see also Conroy et al. 2005).



3 VOIDS IN GALAXY DISTRIBUTIONS 

In the previous section we described how to compute
predictions for the number density of voids defined by dark matter
halos. However, what we observe are galaxies, which are assumed
to be embedded in those dark matter halos. Hence, we need
to establish a relationship between them in order to obtain
$P_n(r)$ for galaxy distributions and so be able to compare
model predictions with observations.

From equation (10) we see that the relationship between
halos and galaxies enters only through $\delta_{ns}$, which quantifies
the biasing of the galaxies with respect to the underlying
matter distribution. We denote the function carrying the
biasing for galaxies by $\delta_{Ls}$. This function is determined by
the $\delta_{ns}$ for halos through the relationship of galaxies with
halos. The general equation to compute $P_n(r)$ for a galaxy
distribution can be written in the same way as equation
(10), but replacing $\delta_{ns}$ by $\delta_{Ls}$. In PBP06 we found (see also
Yang et al. 2003) that

$$ 1 + \delta_{Ls}(\delta_l, L) = \frac{\int_0^\infty \Phi(>L|m,\delta_l)\,n_c(m,\delta_l)\,dm}{\Phi_u(>L)} \qquad (12) $$

Here, $n_c(m,\delta_l)$ is the conditional mass function within a
region with linear density fluctuation $\delta_l$, $\Phi(>L|m,\delta_l)$ is the
cumulative conditional luminosity function and $\Phi_u(>L)$ is
the unconditional one. It is worth mentioning that in this
equation we allow for a dependence of the conditional
luminosity function on the environmental density through $\delta_l$. However,
there is increasing evidence favoring a conditional
luminosity function that does not depend on environment over a
large range of luminosities (see Tinker et al. 2008; Tinker & Conroy 2008).

Given the conditional luminosity function, $\Phi(>L|m,\delta_l)$,
equation (12) can be used to obtain $\delta_{Ls}$ as a function
of $\delta_l$. However, this is neither simple nor straightforward.
Here, we prefer to determine a functional form from
general considerations and to calibrate it with numerical
simulations. We know that for halos above any given mass the
dependence of $1+\delta_{ns}$ on $\delta_l$ is accurately fitted by a
Gaussian (equation [8]). This implies that $1+\delta_{ns}$ must be well
approximated by a Gaussian for halos within any
mass interval. Now, in equation (12) we see that $1+\delta_{Ls}$
is the average of $1+\delta_{ns}$ over all halo masses
(weighted by the probability distribution for the mass of
a halo containing a galaxy with luminosity larger than $L$).
Therefore, $1+\delta_{Ls}$ must be approximated by a Gaussian with
an accuracy comparable to that of the approximation
of $1+\delta_{ns}$ by Gaussians within the relevant range of
mass values. Then we can write:



$$ 1 + \delta_{Ls} = A(L)\,e^{-b(L)\delta_l^2} \qquad (13) $$

$$ b(L) = b(m_g) \qquad (14) $$

$$ A(L) = A(m_g) \qquad (15) $$



where $b(m_g)$, $A(m_g)$ are the same functions $b(m)$, $A(m)$
defined in equation (8), evaluated at mass $m_g$. Equation (13)
means that the biasing of galaxies with luminosity larger
than $L$ with respect to the underlying mass distribution is
equal to the biasing of the halos with mass larger than $m_g$.
Let $m$ be the mass such that the number density of all halos
with mass larger than $m$ is equal to the number density of
the galaxies under consideration, $n_g$, i.e. $n(>m) = n_g$ [where
$n(>m)$ is the cumulative cosmic mass function]. The
relationship between $m_g$ and $m$ is given by:



$$ m_g = m\,(1.396)^{\sigma(m)} \qquad (16) $$

where $\sigma(m)$ is the rms linear density fluctuation on scale
$m$. For very large masses $\sigma(m) < 1$ and $m_g$ is very close to
$m$, but for masses with $\sigma(m) > 1$, $m_g$ may be substantially
larger than $m$. Note that $m_g$ is not the mass (lower limit) of
the halos containing the galaxies under consideration, but
the mass such that the clustering properties of the galaxies
are equal to those of all halos more massive than $m_g$.
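
The two steps just described (matching the number density to find $m$, then applying equation (16)) can be sketched as follows. The cumulative mass function and $\sigma(m)$ used here are toy functions invented for the example, not the ones used in this work; only the bisection and the factor $1.396^{\sigma(m)}$ follow the text.

```python
import math


def matched_mass(n_gal, cumulative_mass_function, m_lo, m_hi, iterations=200):
    """Solve n(>m) = n_gal by bisection in log m; n(>m) must decrease with m."""
    lo, hi = math.log(m_lo), math.log(m_hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cumulative_mass_function(math.exp(mid)) > n_gal:
            lo = mid  # too many halos: raise the mass threshold
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))


def clustering_mass(m, sigma_of_m):
    """Equation (16): m_g = m * 1.396**sigma(m)."""
    return m * 1.396 ** sigma_of_m(m)


def toy_cumulative_mass_function(m):
    """Hypothetical n(>m), for illustration only."""
    return 1.0e-2 * (m / 1.0e12) ** -0.9 * math.exp(-m / 1.0e14)


def toy_sigma(m):
    """Hypothetical sigma(m), for illustration only."""
    return 2.0 * (m / 1.0e12) ** -0.15


n_gal = 5.0e-3  # galaxy number density in the same units as the toy n(>m)
m = matched_mass(n_gal, toy_cumulative_mass_function, 1.0e10, 1.0e16)
m_g = clustering_mass(m, toy_sigma)
print(m, m_g)
```

In a real application the two toy functions would be replaced by the cumulative mass function and $\sigma(m)$ of the cosmology under consideration.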

The form of equation (16), where $m_g/m$ depends only
on $\sigma(m)$, is what is expected if the clustering of the
galaxies under consideration is determined only by the process
of gravitational hierarchical collapse. The specifics of the
galaxy formation processes are not relevant inasmuch as
equation (13) is a good approximation. The number density
of galaxies of a given type above a given luminosity does
depend strongly on those specifics, but the relationship
between the number density (which gives $m$) and the clustering
(given by $m_g$) does not.

Equation (13) is the result of fitting a functional form
derived from general considerations to the results of the
numerical simulations described in section 8. It must be noted,
however, that the values of $\sigma$ used in fitting equation (13)
range only from 1.6 to 2.6, which is enough for the masses
and redshifts relevant to our present problem. Further tests
must be made of this equation before extrapolating it far
beyond the well-checked range of $\sigma$ values.

In summary, to obtain the void statistics for galaxies
brighter than a given luminosity, we use equation (10) with
$1+\delta_{ns}$ equal to that of halos larger than a certain mass $m_g$,
which is determined by the number density of the galaxies.



4 VOIDS IN REDSHIFT SPACE 

The equation given in the previous section for $P_n(r)$
corresponds to voids in real space. However, galaxy positions
provided by redshift surveys are in redshift space. In Patiri
et al. (2006) we introduced a method based on the standard
spherical collapse model to transform to real space, one by
one, the voids found in the 2dF Galaxy Redshift Survey. In
the present work we use a similar method. However,
instead of transforming the observed statistic, we
modify our equations in order to compute the model predictions
directly in redshift space, which is more straightforward.

In the spherical expansion model the peculiar velocity
of matter at distance $r$ from the center of a sphere with
actual enclosed density contrast $\delta(r)$ is given by:

$$ V(r) = H\,r\,\mathrm{VEL}[\delta(r)] \qquad (17) $$

where $H$ is the Hubble constant and $\mathrm{VEL}[\delta(r)]$ is a unique
function of the enclosed density contrast (for a specific
cosmology, see Appendix A). The matter distribution within a
maximal sphere (i.e., our definition of voids) is not exactly
spherically symmetric, and outside that sphere the distribution of
matter on scales much smaller and much larger than the
radius is strongly non-spherical. However, the average velocity
field around that sphere [over all voids with given $r$, $\delta(r)$]
is described well by the standard spherical collapse model.
This mean velocity field plays a key role in transforming the
statistics under consideration from real to redshift space.

A sphere with radius $r$ in real space and with inner mean
fractional density $\delta$ has a mean outflow at its surface given
by equation (17). In redshift space, that sphere transforms
into a prolate spheroid elongated along the line of sight. The
semiaxis along that direction in redshift space, $r_\parallel$, is given
by:

$$ r_\parallel = r\,(1 + \mathrm{VEL}[\delta(r)]) \qquad (18) $$

However, the transverse axes $r_\perp$ remain equal to $r$. This
result is exact for any value of $\mathrm{VEL}[\delta(r)]$. Conversely, for
small values of $\mathrm{VEL}[\delta(r)]$, a sphere with radius $r^*$ in redshift
space is approximately transformed into an oblate spheroid
in real space with the smallest semiaxis, $r_\parallel$, along the line
of sight. This semiaxis is related to $r^*$ by:

$$ r_\parallel = r^*\,(1 + \mathrm{VEL}[\delta(r)])^{-1} \qquad (19) $$

and

$$ r_\perp = r^* \qquad (20) $$

Now, in equation (10) the dependence on $r$ enters through
$P(\delta_l,r)$ and through $\delta(\delta_l,r)$ (a small dependence), and in
both of these quantities (see Appendix A) $r$ enters through
$\sigma(Q)$ (the rms of the linear enclosed density contrast within
a Lagrangian sphere with radius $Q$). The relevant value of
$Q$ here is $r[1+\delta(r)]^{1/3}$. Thus, in equation (10) the radius $r$
enters in the combination $\sigma(r[1+\delta(r)]^{1/3})$. In this equation,
$r$ is the radius of a sphere in real space, but as we want
$P_n(r^*)$ in redshift space, the relevant body in real space is
no longer a sphere but a spheroid. The relevant $\sigma(Q)$ must
now be evaluated within a spheroid. However, Betancort-Rijo
& Lopez-Corredoira (2002) showed that for spheroids
not differing much from spheres $\sigma$ is almost independent
of the form of the spheroid, depending only on its volume.
Thus, we may use for $r$ the radius of a sphere with the same
volume as the spheroid, which is given by

$$ r = (r_\perp^2\,r_\parallel)^{1/3} = r^*\,(1 + \mathrm{VEL}[\delta(r)])^{-1/3} \qquad (21) $$

To obtain $P_n(r)$ in redshift space, which we represent by
$P_n^{z}(r^*)$, within this approximation, we may use equation (10)
with the following replacement:

$$ [1+\delta(\delta_l,r)] \rightarrow [1+\delta(\delta_l,r)]\,[1+\mathrm{VEL}(\delta_l)] \qquad (22) $$

The reason for this replacement is that the procedure
followed to obtain equation (10), which corresponds to real
space, can also be followed in redshift space. The only
difference from real space is that in redshift space the role of the
density contrast, $\delta$, is played by an apparent density
contrast $\delta'$ (the right-hand side of equation (22) is simply
$1+\delta'$).
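
The geometry of equations (18) to (22) can be written out as a few helper functions. This is a minimal sketch with names of our own choosing: the value of $\mathrm{VEL}[\delta]$ is passed in by the caller as `vel`, because its functional form is given in Appendix A and is not reproduced here; the value 0.2 below is purely illustrative.

```python
def redshift_space_semiaxes(r, vel):
    """Equation (18): real-space sphere of radius r seen in redshift space.

    Returns (r_parallel, r_perp)."""
    return r * (1.0 + vel), r


def real_space_semiaxes(r_star, vel):
    """Equations (19)-(20): redshift-space sphere of radius r_star mapped to real space.

    Returns (r_parallel, r_perp)."""
    return r_star / (1.0 + vel), r_star


def equal_volume_radius(r_star, vel):
    """Equation (21): radius of the sphere with the same volume as the spheroid."""
    r_par, r_perp = real_space_semiaxes(r_star, vel)
    return (r_perp ** 2 * r_par) ** (1.0 / 3.0)


def apparent_density_factor(one_plus_delta, vel):
    """Right-hand side of equation (22): 1 + delta' = (1 + delta)(1 + VEL)."""
    return one_plus_delta * (1.0 + vel)


vel = 0.2  # illustrative outflow parameter, not a value from the paper
print(redshift_space_semiaxes(10.0, vel))
print(equal_volume_radius(13.0, vel))
print(apparent_density_factor(0.1, vel))
```

Because $\mathrm{VEL}$ is positive for an underdense sphere, the apparent density contrast $\delta'$ is less negative than $\delta$, and the equivalent real-space radius is smaller than $r^*$.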

The proposed replacement only takes into account the
redshift distortion due to the smooth spherical velocity field
associated with the underdensity within the sphere under
consideration. The fluctuation of the velocity field around
its mean is the source of an additional effect on the
values of $P_n(r)$. We found using numerical simulations that
the rms radial displacement of halos surrounding voids is
$1.4\,h^{-1}$ Mpc (Patiri et al. 2006). As a consequence of this
random displacement some halos are driven away from the
void, as defined in real space, while others are drawn closer
to the void. Since the density profile around a void is
increasing, more halos are pulled in than out and
the net effect is a contraction of the voids or, alternatively, a
value of $P_n^{z}(r^* = r)$ smaller than $P_n(r)$. This effect is much
smaller than the previous one, especially for the rather rare
voids we are interested in. We suggest that this effect may
be accounted for by a replacement in equation (10) of the
form:
n^nl+Al — j (23)

where $A$ is a constant. We find, using numerical simulations,
$A \simeq 6$. Using these replacements (equations [22] and [23])
in equation (10) we find results for $P_n^{z}(r^*)$ and the number
density of voids that are in excellent agreement with the
numerical simulations analyzed in this work (see Appendix
A for details). Although these replacements represent the
correct procedure, we shall also use, for computational
reasons, an alternative procedure, in which we simply replace $r$
by a certain function of $\delta$ (equation A9 in Appendix A),
finding results very similar to those found using the procedure
described above.



5 DEPENDENCE OF THE NUMBER DENSITY OF VOIDS ON
REDSHIFT AND COSMOLOGICAL PARAMETERS

In this section we focus our attention on how the VPF and
the number density of voids larger than a given radius
depend on redshift and on the cosmological parameters $\sigma_8$ and $\Gamma$.
Here we present the basic idea, referring the reader to
Appendix A for details of the explicit procedure. In equation
(10) we can see that the VPF depends on the cosmological
parameters through the linear rms of density fluctuations
[$\sigma(r)$], which enters through the probability distribution for
$\delta_l$ on the scale $r$. There is also a dependence entering through
$\delta_{ns}$, given in equation (8).

The $\Lambda$CDM transfer function (e.g. Bardeen et al. 1986)
is essentially determined by a single parameter, $\Gamma$, which
determines the comoving horizon scale at matter domination.
The baryon density plays a small role and is negligible in most
of the cases to which our framework applies. The relevant
parameters determining the power spectrum are $\sigma_8$ (its
amplitude) and $\Gamma$ (its shape). In Appendix A we give $\sigma(r)$ as
an explicit function of these two parameters, i.e. $\sigma(r,\sigma_8,\Gamma)$.
It is important to note that the coefficients $A(m)$ and $b(m)$
in equation (8) also depend on $\sigma_8$ and $\Gamma$ (see equation A7).

Once we have determined the dependence of the VPF
and the number density of voids on the cosmological
parameters, the redshift dependence for a given set of
parameters may be obtained by replacing $\sigma_8$ by:

$$ \sigma_8 \rightarrow \sigma_8\,\frac{D(z)}{D(0)}; \qquad n_v(r,z) = n_v\!\left(r,\ \sigma_8\,\frac{D(z)}{D(0)}\right) \qquad (24) $$

where $D(z)$ is the linear growth factor of density
fluctuations in the model under consideration and $n_v(r,\sigma_8)$ is $n_v(r)$
as a function of $\sigma_8$. This equation reflects the fact that the
dependence of the void statistics (in real space) on
redshift enters only through $\sigma_8$. However, the void statistics
in redshift space show a substantially smaller dependence
on redshift (up to $z \simeq 1$). We can see the difference with
an example. In figure 1 we show the number density of
voids larger than 11, 13 and 15 $h^{-1}$ Mpc (thinnest to thickest
lines, respectively) defined by galaxies with number density
$5\times10^{-3}\,(h^{-1}\,\mathrm{Mpc})^{-3}$ as a function of redshift, both in real
and in redshift space (full and dashed lines, respectively). Note
that the sample of galaxies defining the voids is selected to
keep its number density fixed at any redshift. Therefore,
the value of $m_g$ changes as $m$ changes with redshift. Also,
for a given value of $m$, $\sigma(m)$ changes with $z$ (see Appendix
A).

Figure 1. The number density of voids larger than 11, 13 and
15 $h^{-1}$ Mpc (thinnest to thickest lines, respectively) defined by
galaxies with number density $5\times10^{-3}\,(h^{-1}\,\mathrm{Mpc})^{-3}$ as a
function of redshift. Full lines denote voids in real space, and dashed
lines voids in redshift space.
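
The rescaling of $\sigma_8$ in equation (24) needs the linear growth factor $D(z)$. As a minimal sketch, the code below assumes a flat universe containing only matter and a cosmological constant and uses the standard integral expression $D(a) \propto E(a)\int_0^a da'/[a'E(a')]^3$ with $E = H/H_0$; the function names and the parameter values ($\Omega_m = 0.3$, $\sigma_8 = 0.9$) are illustrative assumptions, not values taken from this paper.

```python
import math


def hubble_ratio(a, omega_m):
    """E(a) = H(a)/H0 for a flat universe with matter and a cosmological constant."""
    return math.sqrt(omega_m / a ** 3 + (1.0 - omega_m))


def growth_unnormalised(a, omega_m, steps=2000):
    """D(a) up to a constant: E(a) * integral_0^a da' / (a' E(a'))^3 (midpoint rule)."""
    h = a / steps
    integral = 0.0
    for i in range(steps):
        ap = (i + 0.5) * h
        integral += h / (ap * hubble_ratio(ap, omega_m)) ** 3
    return hubble_ratio(a, omega_m) * integral


def sigma8_at_redshift(sigma8_today, z, omega_m):
    """Equation (24): sigma_8 -> sigma_8 * D(z) / D(0)."""
    a = 1.0 / (1.0 + z)
    return sigma8_today * growth_unnormalised(a, omega_m) / growth_unnormalised(1.0, omega_m)


for z in (0.0, 0.5, 1.0):
    print(z, sigma8_at_redshift(0.9, z, 0.3))
```

The real-space number density at redshift $z$ then follows by evaluating $n_v(r,\sigma_8)$ at the rescaled amplitude. For $\Omega_m = 1$ the routine reduces to $D \propto 1/(1+z)$, which provides a quick check of the integration.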



6 CORRECTIONS DUE TO OBSERVATIONAL 
EFFECTS 

6.1 Voids within a finite sample 

The quantity n v (r) represents the mean number density of 
non-overlapping maximal spheres with radius larger than r. 
These maximal spheres can be determined precisely for an 
arbitrarily large sample but this is not the case for finite sam- 
ples. Consider a large sample (much larger than the mean 
distance between the voids) and assume that all maximal 
spheres larger than r are located sufficiently far away from 
its boundaries. Consider now a smaller sample entirely con- 
tained within the region containing these maximal spheres. 
The mean number of maximal spheres larger than r that are 
entirely contained within the finite sample is given by: 



$$N(r) = \int_r^{\infty} \left( -\frac{dn_v(r')}{dr'} \right) \Delta V(r')\, dr' \qquad (25)$$

where $\Delta V(r)$ is the available volume within the sample for
the centers of the spheres of radius $r$, and the parenthesis is
the number density of maximal spheres with radius between $r'$
and $r' + dr'$. Integrating by parts the r.h.s. of equation (25) we
have:

$$N(r) = n_v(r)\,\Delta V(r) + \int_r^{\infty} n_v(r')\, \frac{d\Delta V(r')}{dr'}\, dr' \simeq n_v(r)\,\Delta V(r) + \frac{d\Delta V(r)}{dr} \int_r^{\infty} n_v(r')\, dr'. \qquad (26)$$

This last approximation is valid only for values of $r$ such
that the mean size $\bar{r}'$ of all voids larger than $r$ is only slightly
larger than $r$. For $\bar{r}'$ we have:

$$\bar{r}'(r) = -\frac{1}{n_v(r)} \int_r^{\infty} r'\, \frac{dn_v(r')}{dr'}\, dr' = r + \frac{1}{n_v(r)} \int_r^{\infty} n_v(r')\, dr' \qquad (27)$$

Using this in equation (26) we have

$$N(r) \simeq n_v(r)\,\Delta V[\bar{r}'(r)]. \qquad (28)$$

For simplicity, here we will consider samples that are either
a box of side $L$ or a wedge defined by two parallels and two
meridians in the sky, with depth $R$. For the first case $\Delta V(r)$
is given by:

$$\Delta V(r) = (L - 2r)^3 \qquad (29)$$

while for the second case we have (see Patiri et al. 2006):

$$\Delta V(r) = \int_{r/\sin(\Delta\delta/2)}^{R-r} P(u, r)\, u^2\, du \qquad (30)$$

where

$$P(u,r) = \left[ \sin\left(\delta_0 + \Delta\delta - {\rm asin}(r/u)\right) - \sin\left(\delta_0 + {\rm asin}(r/u)\right) \right] \left[ \Delta\alpha - 2\, {\rm asin}\left( \frac{{\rm asin}(r/u)}{\cos(\delta_0 + \Delta\delta/2)} \right) \right] \qquad (31)$$

$\delta_0$, $\delta_0 + \Delta\delta$ are the limits of the sample in declination while
$\alpha_0$, $\alpha_0 + \Delta\alpha$ are the limits in right ascension. The radial limit
is $R$.

Equation (28) gives approximately the mean number
of maximal spheres larger than a given radius $r$ within the
sample when those maximal spheres have been determined
within a sample much larger than the actual one. In practice,
however, the maximal spheres are determined using only
the actual sample. Thus, the maximal spheres close to the
boundaries may, actually, not be maximal spheres but locally
maximal spheres. These spheres are slightly smaller than the
actual maximal spheres and their centers are biased towards
the center of the sample with respect to the centers of the
actual maximal spheres. To avoid this effect one could simply
discard all the locally maximal spheres which are so close to
the boundaries that it cannot be known for certain that
they are actually maximal spheres. But in this manner much
information is lost, especially in narrow wedge samples. So,
the appropriate thing to do is to use all locally maximal
spheres and account for the border effect.

As we pointed out before, the net effect of the boundary
is to reduce slightly the size of the spheres and to retract
their centers. The first effect results in a smaller value of
$n_v(r)$ for any value of $r$, but the second produces the opposite
effect, as it encloses the centers of the spheres in a
smaller volume. This last effect dominates, so that the number
density of locally maximal spheres within a finite sample
is larger than that for the actual maximal spheres.

This boundary effect can be taken into account by modifying
the available volume:

$$\Delta V_L(r) = \Delta V\left( \bar{r}'(r) - a\,[\bar{r}'(r) - r] \right) \qquad (32)$$

where $\Delta V_L(r)$ represents the available volume for locally
maximal spheres, and $a$ is a number between 2 and 4. The
number we obtain in the Millennium Run numerical simulation
is 3.3. Thus, we have:

$$\Delta V_L(r) = \Delta V\left( r - 2.3\,[\bar{r}'(r) - r] \right), \qquad r > r_0$$
$$\Delta V_L(r) = \Delta V\left( r - 2.3\,[\bar{r}'(r_0) - r_0] \right), \qquad r \leq r_0 \qquad (33)$$

where $r_0$ is defined by:

$$K(r_0) = 0.34 \qquad (34)$$



and $K$ is defined in equation (4). The interesting cases are
those with $r > r_0$, but written in this form the expression
is valid even for very common voids, at least up to the
largest $K$ values we explored ($K = 0.43$). Finally, we have
for the mean number of locally maximal spheres within the
sample, $N_L(r)$,

$$N_L(r) = n_v(r)\,\Delta V_L(r) \qquad (35)$$

where $n_v(r)$ is the mean number density of actual maximal
spheres (given by equation 3) and $\Delta V_L(r)$ is given by
(33). We shall use this expression to estimate the sample
independent quantity, $n_v(r)$, from the observed statistic,
$N_L(r)$. Note that the actual number density of locally maximal
spheres within the sample, $n_L(r)$, is given by:

$$N_L(r) = n_L(r)\,\Delta V[\bar{r}'(r)] \qquad (36)$$

but $n_L(r)$ is a sample dependent quantity.

In what follows we drop the distinction between local 
and actual maximal spheres. Samples are always finite, so 
it is understood that some of the 'maximal spheres' found 
in them, those close to the border, are only conditionally 
maximal. However, we use them to estimate the sample in- 
dependent number density of actual maximal spheres. 
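For the box geometry, the finite-sample corrections of this section can be sketched as follows. This is a minimal illustration, not the procedure used for the results in this paper: the exponential `toy_n_v` is a hypothetical stand-in for the real $n_v(r)$, the function names are ours, and only the branch $r > r_0$ of the locally-maximal-sphere correction is implemented, with the coefficient 2.3 quoted in the text.

```python
import math


def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


def mean_void_radius(n_v, r, r_max):
    """Mean radius of all voids larger than r: r + (1/n_v(r)) * integral of n_v.

    The upper limit of the integral is truncated at r_max.
    """
    return r + integrate(n_v, r, r_max) / n_v(r)


def box_volume(side, r):
    """Volume available to the centers of spheres of radius r in a box."""
    return max(side - 2.0 * r, 0.0) ** 3


def expected_voids_in_box(n_v, r, side, r_max):
    """N(r) ~ n_v(r) * DeltaV evaluated at the mean radius of voids larger than r."""
    return n_v(r) * box_volume(side, mean_void_radius(n_v, r, r_max))


def expected_local_voids_in_box(n_v, r, side, r_max, coeff=2.3):
    """N_L(r) = n_v(r) * DeltaV_L(r), for r > r_0 only (assumed branch)."""
    shift = coeff * (mean_void_radius(n_v, r, r_max) - r)
    return n_v(r) * box_volume(side, r - shift)


def toy_n_v(r, n0=1e-3, scale=1.5):
    """Hypothetical cumulative void density; NOT the paper's n_v(r)."""
    return n0 * math.exp(-r / scale)
```

With the toy exponential density the mean radius is simply $r$ plus the exponential scale, and the number of locally maximal spheres comes out larger than the number of actual maximal spheres, in line with the discussion above.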



6.2 Spectroscopic completeness 

In the equations considered so far, the local number density
of galaxies is modulated by the underlying density fluctuations
and their corresponding bias. However, in real galaxy
redshift surveys there are additional fluctuations in the local
number density of galaxies due to the failure to obtain spectra
of some galaxies. The completeness, defined as the ratio of
successfully obtained redshifts to targetable objects, varies
non-trivially from 0 to 1, with angular and magnitude
dependencies. In this paper we assume a constant completeness in
magnitude, in order to focus on the spectroscopic completeness.
This incompleteness has different origins and usually lies around
10%, depending on the survey (see e.g. Norberg et al. 2002; Conroy
et al. 2005). These fluctuations must be incorporated into
our framework to obtain predictions for the number of voids
expected within real galaxy samples. The essential question
is whether the correlation length of the completeness fluctuations
is much larger than the size of the voids. If the completeness
changes over the size of a void, the formalism developed in
this work needs to be re-elaborated to some extent. However,
if the correlation length of the completeness fluctuations is
much larger than the size of the voids, these fluctuations
behave as a random dilution of the number of galaxies. Thus,
equation (10) can be used replacing $\bar{n}$ by its diluted value.
So, $P_n(r)$ is given by:

$$P_n(r) = \int_0^1 \left[ P_n(r, \bar{n} c) \right] P(c)\, dc, \qquad (37)$$

where the function $P_n(r, \bar{n})$ is the same as in equation (10)
and $P(c)$ is the probability distribution for the spectroscopic
completeness, $c$. $\bar{n}$ denotes the intrinsic mean number
density of galaxies. Note that the observed density within the
sample is $\bar{n} c$. To obtain the number density of maximal
spheres, we have to replace $P_n(r)$ by $n_v$ in the previous
equation.
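The dilution average over the completeness distribution can be sketched with a discrete $P(c)$. In the snippet below the Poisson `poisson_pn` is only a stand-in for the clustered $P_n(r, \bar{n})$ of equation (10), and the representation of $P(c)$ as a list of (completeness, weight) pairs is an assumption made for illustration.

```python
import math


def poisson_pn(n_objects, radius, density):
    """Stand-in for P_n(r, n): Poisson probability for unclustered objects."""
    mean = density * 4.0 / 3.0 * math.pi * radius ** 3
    return mean ** n_objects * math.exp(-mean) / math.factorial(n_objects)


def completeness_averaged_pn(pn, n_objects, radius, density, completeness_pdf):
    """Average pn(n, r, density * c) over a discrete completeness distribution.

    completeness_pdf is a list of (c, weight) pairs; weights are normalized here.
    """
    norm = sum(weight for _, weight in completeness_pdf)
    total = sum(
        weight * pn(n_objects, radius, density * c)
        for c, weight in completeness_pdf
    )
    return total / norm
```

As expected for a dilution, lowering the completeness over part of the sample raises the probability of finding an empty sphere. The same average applies to $n_v$ by passing a function that returns the void number density instead.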



6.3 The snapshot effect 

The equations presented above compute the predictions for
the number of voids assuming a constant value of $z$ over
the sample. However, in observational samples redshift
depends on the radial coordinate. Thus, due
to the dependence of the void number densities on redshift,
those densities also depend on the radial coordinate.

The number density of voids at a distance $u$ from the
origin is given by $n_v[r, z(u)]$, where $z(u)$ is the
distance-redshift relationship. For a rectangular strip the expected
number of voids is given by:

$$N(r) = \int_{\tilde{r}/\sin(\Delta\delta/2)}^{R-\tilde{r}} P(u, \tilde{r})\, n_v[r, z(u)]\, u^2\, du \qquad (38)$$

where

$$\tilde{r} = r - 2.3\,[\bar{r}'(r) - r] \qquad (39)$$

where $P(u,r)$ is given in (31), $\bar{r}'$ is given in (27), and $R$ is
the depth of the sample. For small values of $z$, $\Omega_m = 0.3$,
$\Omega_\Lambda = 0.7$, $z(u)$ may be approximated by:

$$z(u) \simeq 3.336 \times 10^{-4}\, u + 2.7 \times 10^{-8}\, u^2 \qquad (40)$$

with co-moving distance $u$ in units of $h^{-1}$ Mpc. The mean
number density of voids within the sample is obtained by
dividing the expected number of voids by the volume available
for those voids, $\Delta V_L(r)$, given by equation (33).
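The distance-redshift approximation of equation (40), $z(u) \simeq 3.336 \times 10^{-4} u + 2.7 \times 10^{-8} u^2$, can be checked against a direct numerical integration of the co-moving distance. The sketch below assumes a flat model with $\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$; the function names and the midpoint-rule integrator are ours.

```python
import math

C_OVER_H0 = 2997.92458  # c / (100 km/s/Mpc), in h^-1 Mpc


def redshift_approx(u):
    """Equation (40): redshift at co-moving distance u (in h^-1 Mpc)."""
    return 3.336e-4 * u + 2.7e-8 * u ** 2


def comoving_distance(z, omega_m=0.3, omega_l=0.7, steps=2000):
    """Co-moving distance in h^-1 Mpc for a flat model (midpoint rule)."""
    dz = z / steps
    total = 0.0
    for i in range(steps):
        zi = (i + 0.5) * dz
        total += 1.0 / math.sqrt(omega_m * (1.0 + zi) ** 3 + omega_l)
    return C_OVER_H0 * total * dz
```

At $u = 300\,h^{-1}$ Mpc the two agree to much better than one per cent; the agreement degrades slowly with distance, consistent with the approximation being intended for small $z$.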



7 TESTS TO ESTIMATE COSMOLOGICAL 
PARAMETERS 

The framework presented here can be used to perform tests 
to measure the cosmological parameters. The efficiency of 
any test can be addressed rigorously using our formalism, so 
that we may design an optimum test. Here, we will consider 
tests built using the VPF and the number density of voids. 



7.1 VPF vs. number density of voids 

For both the VPF and the number density of voids larger
than a given radius, most of the information concerning
cosmological parameters comes from extreme events. However,
including all the useful information (i.e. more common voids
as well) can improve the statistics. In order to determine the
efficiency of a test to measure a given generic cosmological
parameter, let us say $a$, we can construct a simple test
considering only the voids larger than a given
radius $r$. The rms of that parameter within this test would
be:

$$\mathrm{rms}(a) = \frac{\mathrm{rms}[n_v(r)]}{n_v(r)\, G(r,a)} \qquad (41)$$

where

$$G(r,a) = \frac{d \ln n_v(r)}{d \ln a} \qquad (42)$$



The fractional rms error for the estimate of $n_v(r)$ for a
given sample is that corresponding to a uniform Poissonian
distribution with a correction term, assuming that voids are
uncorrelated (which seems to be the case, see Patiri et al.
2006). Therefore,

$$\frac{\mathrm{rms}[n_v(r)]}{n_v(r)} = \frac{1}{\sqrt{N(r)}} \left( \cdots \right) \qquad (43)$$

where $N(r)$ is the expected number of voids larger than a
given radius $r$. The last parenthesis accounts for the
anti-correlation for voids with distances between their centers
less than $2r$. This factor is important only for small voids.
Then, the best test using this void statistic (i.e. all voids
larger than $r$) is obtained by minimizing equation (41) with
respect to $r$. For the relevant range of parameters, the same
value of $r$ is obtained for any parameter $a$. The efficiency of
this test is somewhat smaller than that of the best test using
all the voids within the sample, but it is simple enough to
serve as a first approximation.

In a similar way, to study the ability of the VPF to 
estimate the parameter a we may consider a simple test 
using the value of the VPF for a given value of r. In this 
case, we have for the error: 



$$\mathrm{rms}(a) = \frac{\mathrm{rms}[P_0(r)]}{P_0(r)\, G'(r,a)} \qquad (44)$$

where

$$G'(r,a) = \frac{d \ln P_0(r)}{d \ln a} \qquad (45)$$

In Patiri et al. (2006) we have shown that

$$\mathrm{rms}[P_0(r)] = 2.82\, \frac{\mathrm{rms}[n_v(r)]}{n_v(r)}\, P_0(r). \qquad (46)$$

We then have for the rms($a$):

$$\mathrm{rms}(a) = 2.82\, \frac{\mathrm{rms}[n_v(r)]}{n_v(r)\, G'(r,a)} \qquad (47)$$

Example values of $G$, $G'$ are given in Table 1.
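The comparison between the two simple tests can be sketched as follows, reading equation (41) as rms($a$) = rms[$n_v$]/($n_v G$) and equation (47) as the same expression with $G$ replaced by $G'/2.82$. The sketch keeps only the Poisson term $1/\sqrt{N(r)}$ of the fractional error and drops the anti-correlation correction, which is an assumption that is adequate only for rare voids; the function names are ours.

```python
import math


def rms_from_void_density(n_voids, g):
    """rms(a) from the number density of voids, Poisson term only."""
    return 1.0 / (math.sqrt(n_voids) * abs(g))


def rms_from_vpf(n_voids, g_prime):
    """rms(a) from the VPF, Poisson term only, with the 2.82 factor of the text."""
    return 2.82 / (math.sqrt(n_voids) * abs(g_prime))


def void_density_is_better(n_voids, g, g_prime):
    """True when the void-density test yields the smaller rms(a)."""
    return rms_from_void_density(n_voids, g) < rms_from_vpf(n_voids, g_prime)
```

In this reading the VPF test is competitive only when $G'$ exceeds $2.82\,G$, independently of the number of voids in the sample.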



7.2 Maximum Likelihood test 

Another test is the Maximum Likelihood test. To perform
this test, we consider all voids larger than a given radius $r_0$.
This radius could be chosen arbitrarily small, taking into
account that very small voids do not carry any cosmological
information. The probability distribution for the radius of
all voids larger than $r_0$ is given by:

$$P(r) = -\frac{1}{N(r_0)}\, \frac{dN(r)}{dr} \qquad (48)$$

where $N(r)$ is the expected value for the number of voids
larger than $r$ within the sample (for a given set of
cosmological parameters).

As in this paper we are particularly interested in $\sigma_8$
and $\Gamma$, the likelihood of the observed data for that set of
parameters is

$$\mathcal{L}(\sigma_8, \Gamma) = \prod_i P(r_i; \sigma_8, \Gamma). \qquad (49)$$

This is so because the sizes of two different voids are
independent random variables. With this function the best
estimate may be obtained by maximizing with respect to $\sigma_8$
and $\Gamma$, while the confidence levels are obtained following the
standard Bayesian approach.
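A minimal sketch of the likelihood maximization is given below. It is a toy under loudly stated assumptions: the cumulative counts are taken to be exponential, $N(r) \propto e^{-r/s}$, so that the normalized radius distribution for voids larger than $r_0$ is $e^{-(r-r_0)/s}/s$, and a single hypothetical parameter $s$ plays the role that $\sigma_8$ and $\Gamma$ play in the text.

```python
import math


def log_likelihood(radii, r0, scale):
    """ln L = sum_i ln P(r_i) for the toy model N(r) ~ exp(-r / scale)."""
    return sum(-math.log(scale) - (r - r0) / scale for r in radii)


def maximum_likelihood_scale(radii, r0, grid):
    """Grid search for the value of `scale` that maximizes the likelihood."""
    return max(grid, key=lambda scale: log_likelihood(radii, r0, scale))


# Hypothetical void radii (in h^-1 Mpc), all larger than r0 = 12.
radii = [12.5, 13.0, 14.0, 16.5]
grid = [i / 10 for i in range(5, 51)]   # 0.5, 0.6, ..., 5.0
best_scale = maximum_likelihood_scale(radii, 12.0, grid)
```

For this toy model the maximum sits at the mean excess radius above $r_0$. In a real application `log_likelihood` would evaluate the analytic $N(r)$ on a two-dimensional grid of parameters.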



7.3 $\chi^2$ test

A less sophisticated test, but almost as efficient as that of
Maximum Likelihood, is the $\chi^2$ test. This test can be done
by computing the $\chi^2$, where the number of degrees of freedom
corresponds to the number of bins in radius:

$$\chi^2(\sigma_8,\Gamma) = \sum_i \frac{[N_i - \bar{N}_i(\sigma_8,\Gamma)]^2}{[\mathrm{rms}(N_i)]^2} \qquad (50)$$



where $N_i$ is the number of voids with radius within the $i$-th
bin found in the observational sample. On the other hand,
the expected value for the number of voids in the same bin
in radius for given values of $\sigma_8$, $\Gamma$ is:

$$\bar{N}_i(\sigma_8,\Gamma) = N_V(r_{i+1})(\sigma_8,\Gamma) - N_V(r_i)(\sigma_8,\Gamma) \qquad (51)$$

where $r_{i+1}$ and $r_i$ are the boundaries of the $i$-th bin and
$N_V(r)$ is given in equation (36). The rms of $N_i$ is given by
equation (43) (note the change in the subscript L by V) and
it is well approximated by $\sqrt{N_i}$.

The statistical significance of the $\chi^2$ test is that of a $\chi^2$
statistic with $n$ degrees of freedom because the number of
voids in each bin is an independent random variable. The
bins must be chosen wide enough to contain a
large number of voids, so that this number may be assumed
to follow a Gaussian distribution. Even though $N_V(r)$ can
be computed directly with our analytic equations, its
accuracy is of the order of 5-7%, which could slightly change
the result of the test. To avoid this, we suggest using mock
catalogs with large volumes in order to better quantify the
sampling errors. We can then use our analytic framework
to extrapolate the statistics found in mock catalogs to other
values of the cosmological parameters. The values for $\bar{N}_i$ can be
computed in the following way:



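$$\bar{N}_i(\sigma_8,\Gamma) = N_i^{\rm sim}\, F(\sigma_8,\Gamma) \qquad (52)$$

and

$$F(\sigma_8,\Gamma) = \frac{N(\sigma_8,\Gamma)}{N(\sigma_8^{\rm sim},\Gamma^{\rm sim})} \qquad (53)$$

where $\sigma_8^{\rm sim}$, $\Gamma^{\rm sim}$ are the cosmological parameters for which
the numerical simulation was run. Note that $F$ is simply the
ratio between the expected value obtained with our analytical
expression for the cosmological parameters under consideration
and that obtained, with the same expression, for the parameters
of the numerical simulation.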
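The binned test and the mock-catalog rescaling can be sketched as below. The sketch assumes bin edges in ascending order, so that the expected count in a bin is the cumulative count at the lower edge minus that at the upper edge, and it approximates $[\mathrm{rms}(N_i)]^2$ by the expected count (a pure Poisson assumption). The function names and the example numbers are hypothetical.

```python
def expected_counts(cumulative_n, edges):
    """Expected voids per radius bin from a cumulative N_V(r) (voids larger than r)."""
    return [
        cumulative_n(low) - cumulative_n(high)
        for low, high in zip(edges[:-1], edges[1:])
    ]


def chi_square(observed, expected):
    """Sum of (N_i - expected_i)^2 / expected_i over the bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def rescale_mock_counts(mock_counts, analytic_target, analytic_sim):
    """Mock counts times F, with F = analytic(target) / analytic(simulation)."""
    return [m * t / s for m, t, s in zip(mock_counts, analytic_target, analytic_sim)]
```

Using the mock counts as the baseline and the analytic expressions only through the ratio $F$ means that a multiplicative error common to both analytic evaluations cancels.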



8 COMPARISON WITH NUMERICAL 
SIMULATIONS 

We have to use cosmological numerical simulations exten- 
sively in order to test the predictions made by our formalism. 
We also need these simulations to calibrate the equations 
with free parameters. 



8.1 Cosmological N-body Simulation and Galaxy
Formation Model

In this work we took advantage of the publicly available
dark matter halo and mock galaxy catalogs produced with
the Millennium Run simulation (Springel et al. 2005) and
the 'MPA' semi-analytic model of galaxy formation (SAM)
applied to it (Croton et al. 2006; De Lucia & Blaizot 2007).
The Millennium Run simulation follows the evolution of $10^{10}$
dark matter particles in a periodic box of 500 $h^{-1}$ Mpc on a
side with a mass resolution per particle of $8.6 \times 10^{8}\,h^{-1} M_\odot$.
The simulation was run with cosmological parameters
consistent with the combined analysis
of the 2dFGRS and WMAP1 data ($\Gamma = 0.1825$, $\sigma_8 = 0.9$).
The halos in the simulation were identified in each time step
using a friends-of-friends algorithm with a linking length of
0.2 times the mean particle separation. For full details of the
simulation we refer the reader to Springel et al. (2005).

The gravitational potential of each halo accretes the
surrounding gas from which galaxies form. The semi-analytic
model tracks, in a parametrized way, various physical
processes that are supposed to play a key role in galaxy
formation, such as radiative cooling of hot gas, star formation
in the cold disk, supernova feedback, black hole growth
and AGN feedback through the 'quasar' and 'radio' epochs
of AGN evolution, metal enrichment of the inter-galactic
and intra-cluster medium, and galaxy morphology shaped
through mergers and merger-induced starbursts.

The full galaxy catalog produced with the SAM (also
known as the 'Delucia2006a' catalog$^1$) contains information for
about 12 million galaxies brighter than $M_r = -17$. For each
of these galaxies we have available, among other properties,
positions and velocities, magnitudes in several band passes
(Johnson, Buser, 2MASS as well as the 5 SDSS bands),
stellar mass and the mass of its parent dark matter halo.

In order to quantify errors, we divided the full box of
500 $h^{-1}$ Mpc on a side into 8 smaller boxes of 250 $h^{-1}$ Mpc
on a side each. From these boxes we constructed halo and galaxy
samples, in both real and redshift space. To produce
redshift space catalogs we used the distant observer technique.
We selected for our samples all halos with masses larger
than $6.6 \times 10^{11}\,h^{-1} M_\odot$ and galaxies brighter than
$M_r = -20.4 + 5\log h$. The mean number density of objects in our
halo and galaxy samples is $5 \times 10^{-3}$ and $5.17 \times 10^{-3}$
$(h^{-1}\,{\rm Mpc})^{-3}$ respectively.
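The two sample-construction steps just described, splitting the box into eight sub-boxes and moving to redshift space with a distant observer, can be sketched as follows. The sketch assumes positions in $h^{-1}$ Mpc, velocities in km/s, a line of sight along the third axis, a $z = 0$ snapshot (so that $H = 100\,h$ km s$^{-1}$ Mpc$^{-1}$), and periodic wrapping of the full box; these choices and the function names are ours and may differ from the actual pipeline.

```python
def octant_index(position, box_size=500.0):
    """Index (0 to 7) of the half-size sub-box that contains a position."""
    half = box_size / 2.0
    ix, iy, iz = (int((c % box_size) // half) for c in position)
    return ix + 2 * iy + 4 * iz


def to_redshift_space(position, velocity, box_size=500.0, hubble=100.0):
    """Distant-observer shift along the third axis: s = x + v_los / H."""
    x, y, z = position
    shifted = (z + velocity[2] / hubble) % box_size  # periodic wrap
    return (x, y, shifted)
```

A galaxy moving away from the observer at 300 km/s is displaced by $3\,h^{-1}$ Mpc along the line of sight, which is not negligible compared with the void radii considered here.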

8.2 Results 

We conducted extensive comparisons for P n (r) and n v (r) in 
the galaxy and halo distributions, in both real and redshift 
spaces, finding excellent agreement. 

In Figure 2 (top panels) we show $P_1(r)$ and $P_2(r)$ as
a function of radius in the distribution of dark matter halos
with mass larger than $6.6 \times 10^{11}\,h^{-1} M_\odot$ in real space.
The full line and symbols are the results obtained using our
formalism and the Millennium Run respectively. The number
density of halos in this sample is $5 \times 10^{-3}$ $(h^{-1}\,{\rm Mpc})^{-3}$. In
the bottom panels we show $P_1(r)$ and $P_2(r)$ obtained for
the galaxies in redshift space. The symbols are the same
as in the top panels. The number density of galaxies is
$5.17 \times 10^{-3}$ $(h^{-1}\,{\rm Mpc})^{-3}$ and the mass of the halos that are
clustered as these galaxies is $m_g = 9.14 \times 10^{11}\,h^{-1} M_\odot$ [see eq.
(A9)]. We can see that, overall, the results obtained with
our formalism are in excellent agreement with the statistics
found in the Millennium Run for both halo and galaxy
distributions.

In Figure 3 we show the VPF for the same distribution
of galaxies mentioned above. We can see that our formalism
is reliable even down to values of $r$ much smaller than those
actually relevant for our purposes. Also, the assumption that
galaxies are clustered like halos above some mass (see eq. 3) not
only produces the correct VPF, but also leads to the correct
value for other statistics of underdense regions on the
relevant scales, i.e. those larger than the halos themselves.

In Figure 4 we show the comparison of predictions for
the number densities of voids larger than $r$ obtained from
our formalism and those from the Millennium Run in
redshift space. In the left panel of this figure we show the ratio
of the number density of voids at present to that at $z = 1$
in the distribution of dark matter halos. In the right panel
we plot the same ratio but for voids defined by galaxies. In
both cases the value of the number density of the defining
objects is $5 \times 10^{-3}$ $(h^{-1}\,{\rm Mpc})^{-3}$. At $z = 1$, the mass (lower limit)
of the halos with the same number density as those at $z = 0$
is $m = 6.48 \times 10^{11}\,h^{-1} M_\odot$ (see eq. A9). For galaxies, we
have [using eq. (16)] $m_g = 7.93 \times 10^{11}\,h^{-1} M_\odot$. We see that
$n_v$ is accurately predicted by our formalism both for voids
defined by halos and galaxies. Also, the dependence on
redshift is obtained with good accuracy for all voids, especially
for those rare enough [$K < 0.2$, see eq. (4)] and relevant for
our purposes.

$^1$ Both dark matter and galaxy catalogs can be downloaded from
http://www.g-vo.org/MyMillennium2/

Figure 3. The void probability function (VPF) for the same
galaxies as Figure 2. See text for details.

It is important to note that the predictions for $P_n(r)$
in halo distributions obtained using our formalism do not
involve any fit to the simulations. However, in order to
compute $n_v(r)$ using equation (3), we have to fit two coefficients
(3.5 and 2.18 in that equation). This was done using the
Millennium Run and several other results from numerical
simulations (see Patiri et al. 2006). These coefficients are
particularly important for common voids (e.g. voids larger than
12 $h^{-1}$ Mpc defined by the galaxies used above). However, for
rare voids, the relevance of these coefficients is much smaller.

For galaxy distributions, our predictions for $P_n(r)$ and
$n_v(r)$ involve determining the mass of the halos ($m_g$) which
are clustered like the galaxies. To this end we use $m_g$ as a free
parameter and choose it so as to maximize the agreement
between the predictions for $P_n(r)$ and $n_v(r)$ using our
formalism and the results found in the numerical simulations.
We obtain $m_g = 9.15 \times 10^{11}\,h^{-1} M_\odot$, and with this value we
fitted the only free parameter in equation (16). Using this
equation we may obtain the value of $m_g$ corresponding to
any galaxy sample at any redshift and for any value of the
cosmological parameters.

Finally, in Table 1 we show the quantities $G_\Gamma$, $G'_\Gamma$, $G_{\sigma_8}$,
$G'_{\sigma_8}$, $G_n$ and $G'_n$ generically defined by:

$$G_a = G(r, a); \qquad G'_a = G'(r, a) \qquad (54)$$

with $G(r,a)$, $G'(r,a)$ as defined in equations (42) and (45).
These quantities characterize the sensitivity of $n_v(r)$ and
$P_0(r)$ to the generic parameter $a$. We also consider the
quantities $G_\gamma$ and $G'_\gamma$:

$$G_\gamma = G(r, \gamma) = \frac{d \ln n_v(r)}{d \ln \gamma}; \qquad G'_\gamma = G'(r, \gamma) = \frac{d \ln P_0(r)}{d \ln \gamma}; \qquad \gamma = \frac{d \ln \sigma(r)}{d \ln r} \qquad (55)$$

These quantities are closely related to $G_\Gamma$, $G'_\Gamma$ and
characterize the sensitivity of $n_v(r)$, $P_0(r)$ to the local logarithmic
slope of $\sigma(r)$. They may be obtained from $G_\Gamma$, $G'_\Gamma$ through
the relationship between $\Gamma$ and $\gamma(r)$.

Figure 2. Top panels: $P_1(r)$ and $P_2(r)$ in the distribution of halos with mass larger than $6.6 \times 10^{11}\,h^{-1} M_\odot$ in real space. The full line
and symbols are the results obtained using our formalism and the Millennium Run respectively. The number density of halos in this
sample is $5 \times 10^{-3}$ $(h^{-1}\,{\rm Mpc})^{-3}$. Bottom panels: $P_1(r)$ and $P_2(r)$ obtained with our formalism and in the mock catalog for galaxies brighter
than $M_r = -20.4 + 5\log h$ in redshift space. The symbols are the same as in the top panels.



9 DISCUSSION AND CONCLUSIONS


In this work we developed an analytical formalism dealing
with the clustering properties of dark matter halos and
galaxies in underdense regions. In particular, we extended
an existing framework to account for redshift distortions and
observational effects. We also included in the formalism the
high precision conditional mass function recently published
by Rubiño et al. (2008). We showed that our formalism allows
us to calculate accurately several void and underdense
statistics, such as $P_n(r)$ and $n_v(r)$, in dark matter halo
distributions in both real and redshift space at any redshift.

Figure 4. Left panel: The ratio of the number density of voids at present to that at $z = 1$ in the distribution of dark matter halos
obtained using our formalism (full line) and the Millennium Run (discrete symbols). Right panel: the same ratio but for voids defined
by galaxies. In both cases the value of the number density of the defining objects is $5 \times 10^{-3}$ $(h^{-1}\,{\rm Mpc})^{-3}$.

12.0    0.149    1.816    0.718    2.065    0.810    1.500
13.0    0.430    2.304    0.904    2.412    0.898    1.661
14.0    0.814    2.857    1.207    2.790    1.021    1.825
15.0    1.327    3.480    1.438    3.201    1.318    1.996
16.0    1.953    4.174    2.086    3.646    1.563    2.170

Table 1. The values of $G$ and $G'$ represent the sensitivity of the number densities of voids and the VPF to the parameters shown
respectively. These numbers correspond to $\sigma_8 = 0.9$, $\Gamma = 0.1825$ and $n = 5 \times 10^{-3}$ $(h^{-1}\,{\rm Mpc})^{-3}$. See text for details.



Our predictions for $P_n(r)$ are particularly remarkable, since
they are the result of purely theoretical considerations.

We also found that for galaxy distributions, $P_n(r)$ and
$n_v(r)$ may be obtained, to a very good approximation, by
assuming that the galaxies have the same clustering properties
as halos above a given mass $m_g$. We deduced a relationship
between this mass and that of halos with the same accumulated
number density as the galaxies. Similar approaches
have proven very useful to describe other properties of galaxy
clustering (e.g. Conroy, Wechsler & Kravtsov 2006). The
equation we obtained contains a single free parameter that
we fitted using numerical simulations at $z = 0$, leading to
predictions for $P_n(r)$ and $n_v(r)$ that are remarkably good
at any redshift compared to those statistics found in the
simulations.

We found that the dependence of $P_n(r)$ and $n_v(r)$ on
redshift is small, with $n_v(r)$ changing less than 20% between
$z = 1$ and $z = 0$ for voids with radius larger than 13 $h^{-1}$ Mpc.
This is due to the fact that in cosmologies with $\Omega_m < 1$,
the redshift distortions are more effective (for a given value
of $\delta$) at higher redshift. This partially compensates (up to
$z \sim 1$) for the smaller amplitude of density fluctuations.
However, the dependence of $P_n(r)$ and $n_v(r)$ on $\sigma_8$ and $\Gamma$ is
considerably larger, making them well suited as tests
to measure these parameters. We showed how to construct
efficiently several of these tests and discussed in detail the
treatment of several observational effects. Correcting for the
biases implied by these effects may be necessary for an
accurate measurement of cosmological parameters by means
of $P_n(r)$ and $n_v(r)$.

From the cosmological parameter estimation point of
view, there is a close symmetry between voids and clusters.
In fact, rich clusters and large voids (which set the strongest
constraints on cosmological parameters) correspond to
regions of co-moving scale of $\sim 5\,h^{-1}$ Mpc. The main difference
between them is that clusters collapse to form structures of
about $\sim 2 - 3\,h^{-1}$ Mpc while proto-voids expand to voids of
$\sim 13\,h^{-1}$ Mpc. In both cases the efficiency lies in how
precisely the mass of the underlying dark matter can be
determined. For clusters, the main source of uncertainty comes
through the different methods to determine their masses,
such as the temperature-mass relation when using X-rays
(Vikhlinin et al. 2008), while voids do not present this
problem. Moreover, voids might be especially interesting to
measure the normalization of the amplitude of density
fluctuations since they are tracing comparable scales, while clusters
are considerably smaller.

The currently available (2dFGRS, SDSS) and next
generation (e.g. BOSS) large galaxy redshift surveys, in
combination with the analytical formalism presented in this paper,
will allow us to estimate the values of the relevant cosmological
parameters using void statistics, providing independent
measurements. These, along with the estimates made by
other methods, will contribute to reducing their uncertainties.
It is also worth mentioning that in a forthcoming paper we
will explore the dependence of the statistics of voids on $\sigma_8$
and $\Gamma$, using the void statistics found in the 2dFGRS (Patiri
et al., in prep.). All this makes the statistics of underdense
regions a very promising tool to constrain cosmology not
just with the current surveys but also with next generation
high redshift surveys.

APPENDIX A: EXPLICIT COMPUTATION OF $P_n(r)$

In this Appendix we show the detailed procedure for
computing $P_n(r)$ and the void statistics for given values of $\sigma_8$
and $\Gamma$.

$$P_n(r) = \frac{1}{n!} \int_{-7} P(\delta_l, r)\, [u(\delta_l)]^n\, e^{-u(\delta_l)}\, d\delta_l \qquad (A1)$$

$$u(\delta_l) = \bar{n} V\, [1 + \mathrm{DELF}(\delta_l, r)]\, A\, e^{-b\, \delta_l^2} \qquad (A2)$$

DELF is a function of $\delta_l$ and $r$ that gives the mean actual
density contrast within a sphere with radius $r$ with enclosed
linear density contrast $\delta_l$.



very little with these variables up to $z \sim 1$, so we chose to
hold it fixed. In the case of $m_g$ the scaling can be
approximated by:



DELF (5, 



i,r 



1 + DELT(<5 i 



|1 - £[1 + DELT(5 ; )] 2/3 k(r[l + DELT(<5 ; )] 1/3 )] 2 



(A3) 



where DELT($\delta_l$) denotes the relationship between the actual
and linear enclosed density contrasts in the spherical collapse
model (see PBP06 for details).



1 + DELT(5 ; ) ~ (1 - 0.607^ )" 



(A4) 



$\sigma(Q)$ is the rms of the linear density contrast on a sphere
with Lagrangian radius $Q$. In this equation, $\sigma(Q)$ is evaluated
at $Q$ equal to $r[1 + \mathrm{DELT}(\delta_l)]^{1/3}$. Explicit equations
for $\sigma$ are given at the end of this Appendix. $A$, $b$ in eq. (A2)
are also functions of $\delta_l$ given by:



A ee A(m, Q = r[l + DELT(<5( 



1V31 



\ 0.9 I \0.21 J 



b(m,Q = r[l + DELT(<5 ; 



(A5) 



(A6) 



( BMoi V 
I 0.9 ) 



' r ' 

v 0.21 / 



where $A(m, Q)$, $b(m, Q)$ are functions of the mass of the
objects and the Lagrangian radius of the regions being considered
(that corresponding to an Eulerian sphere with radius
$r$, see Rubiño et al. 2008):



A(m,Q)= [1.577 - 0.298(f)] - 

- [0.0557 + 0.0447(f )]ln(m) - 

- [0.00565 + 0.0018(f ][ln(m)] 2 

b(m,Q)= [0.0025 - 0.00146(f)] + 

+ [0.121 - 0.0156(f)] x 

y TO [0.335+0.019(f )] 



(A7) 



(A8) 



3.51xl0 11 h- 1 M< 







,0^ 



(A9) 



where $M$ is the mass of the objects, and $D(z)$ in equations (A5), (A6)
is the linear growth factor normalized to be 1 at present. The
exponents determining the dependence of $A$, $b$ on $\sigma_8$ and
redshift are slightly different from those given by Rubiño
et al. (2008) but are within the precision afforded by the
procedure used in that work. The exponents given here have
been accurately fitted using numerical simulations.

For dark matter halos, the value of $m$ entering in these
last equations is defined by:


$$n(> m) = n_{\rm sample} \qquad (A10)$$

while for galaxies we have to use $m_g$ given in equation (16).
The definitions of $m$ and $m_g$ imply that these quantities have
to be scaled with $\sigma_8$, $\Gamma$ and redshift. However, $m$ changes



m g (as, F, z) = m(1.396) 



(D(z)a s /0.9)(ryo.21)° 



(All) 



For the probability distribution $P(\delta_l, r)$ of the
linear density contrast within an Eulerian sphere we use the
expression given by Betancort-Rijo & López-Corredoira (2002):



P(8i,r) = 
x 
x 



2 (a(r[l+DELF(J,,r)] 1 /3)2 



t(Si,r)= 0.54 + 0.173 x In ( 



l + DELF( ( 5 i ,r)]- (1 -* ) x 

_d_ ( h , \ 

dSi ^(rll+DELFf^.rU'/S) j 

[l+DELTt^,)] 1 / 3 



(A12) 



Even though a depends on V, and the above equation 
corresponds to F = 0.21, this dependence is not relevant for 
our purposes. 

To obtain $\sigma(Q)$ we use the standard BBKS power spectrum
(Bardeen et al. 1986). We also checked other power
spectra (e.g. Eisenstein & Hu 1999), finding no significant
differences in the final results, i.e.

$$\sigma(Q) \equiv \sigma(Q, \Gamma) \simeq \sigma_8\, A(\Gamma)\, Q^{-B(\Gamma) - C(\Gamma) \ln Q}$$
$$A(\Gamma) = 2.01 + 3.9\,\Gamma; \qquad B(\Gamma) = 0.2206 + 0.361\,\Gamma^{1.5}$$
$$C(\Gamma) \equiv 0.182 + 0.0411 \ln \Gamma \qquad (A13)$$

This fit is valid for $Q > 3\,h^{-1}$ Mpc and $0.1 < \Gamma < 0.5$.
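The fit of equation (A13) is straightforward to evaluate. The sketch below reads it as $\sigma(Q) \simeq \sigma_8 A(\Gamma) Q^{-B(\Gamma) - C(\Gamma)\ln Q}$ with $A = 2.01 + 3.9\Gamma$, $B = 0.2206 + 0.361\Gamma^{1.5}$ and $C = 0.182 + 0.0411\ln\Gamma$, with $Q$ in $h^{-1}$ Mpc; the function name is ours.

```python
import math


def sigma_fit(q, gamma, sigma8):
    """rms linear density contrast on Lagrangian radius q (h^-1 Mpc), eq. (A13)."""
    a = 2.01 + 3.9 * gamma
    b = 0.2206 + 0.361 * gamma ** 1.5
    c = 0.182 + 0.0411 * math.log(gamma)
    return sigma8 * a * q ** (-b - c * math.log(q))
```

A useful consistency check on this reading is that the fit returns $\sigma_8$ at $Q = 8\,h^{-1}$ Mpc to better than one per cent for the values of $\Gamma$ used in this work.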

To obtain P*(r*), i.e. the probability that a sphere of
radius r* in redshift space contains $n$ objects when placed
at random within the distribution, we have to implement
the replacements given in equations (16) and (17) in all
expressions entering in eq. (A1), but for computational reasons
we choose to follow an equivalent procedure whereby $r$ is
replaced by:



[1 - VELffl] 4 - 1 
-4VEL(5) 



1/3 



(A14) 



where the function VEL($\delta$) is defined so that the peculiar
velocity, $V$, of a mass element at distance $r$ from the center of
a spherical mass concentration (or defect) enclosing actual
density contrast $\delta$ is given by:

$$V = H\, r\, \mathrm{VEL}(\delta) \qquad (A15)$$

where $H$ is the Hubble constant at the time being considered.
Betancort-Rijo et al. (2006) showed that:


w 3 dlna 1 + 5 \dS v 'J 

(A16) 

$D(a)$ is the growth factor as a function of the expansion
factor, $a$, and DELK($\delta$) is the inverse function of DELT($\delta_l$)
(see Sheth & Tormen 2002):

$$\mathrm{DELK}(\delta) = \frac{\delta_c}{1.68647} \left[ 1.68647 - \frac{1.35}{(1+\delta)^{2/3}} - \frac{1.12431}{(1+\delta)^{1/2}} + \frac{0.78785}{(1+\delta)^{0.58661}} \right] \qquad (A17)$$

$\delta_c$ is the linear density contrast for the spherical collapse model,
which for the concordance cosmology at present is 1.676. The
logarithmic derivative of $D(a)$ is, for a given cosmology, a
function of redshift. We can approximate it as:

0.6 



dlnD(a) 
dlna 



0.47 



n m [(i + z f + n A /n m ] 



(A18) 



It must be noted that, although $r$ has to be replaced by eq. (A2) in all its appearances in eq. (A14), for computational reasons we only implement that replacement on $P(\delta_l, r)$ and in the explicit appearance of $r$ in eq.



APPENDIX B: THE EFFECT OF COSMIC VARIANCE

In all the equations used in this work, the mean number density of objects was assumed to be obtained from an arbitrarily large sample. The question is which value of $n$ must be used to obtain the expected theoretical value of voids within a specified sample: must we use the universal mean number density, when this is available from a much larger sample, or the mean within the sample under consideration? Conversely, if only the local mean is available, will it bias the expected number of voids within the sample?

Equation (29) gives (for samples at fixed $z$) the expected number of voids, $N(r)$ (note that N(r) = A^i(r)), within a sample of given shape and volume, chosen randomly within a much larger volume with mean number density $\bar{n}$. $N(r)$ can also be expressed in the form:

\[
N(r) = \int N(r)\,|\,n' \,\Big( P(n') \Big)\, dn' \tag{B1}
\]

$N(r)\,|\,n'$ represents the expected number of voids within the sample, conditional on having a mean number density within it equal to $n'$. The parenthesis represents the probability distribution of $n'$, which for large enough samples is simply a Gaussian with mean $\bar{n}$ and variance $\sigma_n^2$ given by:

\[
\sigma_n^2 = S^2 + \frac{1}{\bar{n}\,V_s} \tag{B2}
\]

where $S^2$ is the variance of the density fluctuations within the sample, which for the large sample volumes, $V_s$, usually considered may be approximated by its linear value.

Equation (36) can be solved using the following ansatz:

\[
N(r)\,|\,n' = N(r)^{\rm res}(n = \beta n'); \qquad \left( \sigma(r)^2 \rightarrow \sigma^2(r) - S^2 \right) \tag{B3}
\]

where the right-hand side is equation (36) considered as a function of $n$ evaluated at $\beta n'$, $\beta$ is a parameter to be determined, and equation (36) should be evaluated using the constrained $\sigma(r)$ indicated in the last parenthesis. The reason for this is that within samples with a fixed value of $n'$ the field of density contrast linearly extrapolated to the present behaves, with high precision, like a uniform Gaussian field with a constrained power spectrum (so that on any scale $\sigma(r)'^2 = \sigma(r)^2 - S^2$, see Rubino et al. 2008). This constraint on the power spectrum implies that at a given scale $r_0$ the value of $\sigma(r_0)$ is slightly different from the unconstrained one, and that the local shape of $\sigma(r)$ in the neighborhood of $r_0$ (for any $r_0$) also changes. As we saw before, $n_v(r_0)$ depends both on $\sigma(r_0)$ and on the local shape of $\sigma(r)$, quantified by $G_\gamma$. Therefore, we can write:

\[
N(r_0)^{\rm res}(n = \beta n') = N(r_0)(n = \beta n') \left[ \,\ldots\, \frac{S^2}{\sigma^2(r_0)} \,\ldots\, \right] \tag{B4}
\]

We may also use the approximation:

\[
N(r_0)(n') \simeq N(r_0)(n_0) \left( \frac{\ldots}{n_0} \right)^{\ldots} \tag{B5}
\]



The values for $G_\gamma$ (= $G(r, \gamma)$) and $G_n$ (= $G(r, n)$; see equation (42) for a general definition) are given in section Q. Using expressions (B3) and (B4) in (B5) we find, after eliminating $N(r)$:



1 + 



I + GdG :~ 1] a 2 n ) P G - (B6) 



° 2 {r )J V 2 

Therefore, for P « o 2 {r ) 



13*1- ((g„ - G,)^- + G ^°; - l) A ± (B7) 
V o^(ro) 2 J On 

For most relevant voids $G_\gamma$ and $G_n$ are close to 1, and for all the samples that we shall consider $S^2 < 10^{-2}$. Therefore, $\beta$ is within 1% or 2% of unity, implying that:



\[
N(r)\,|\,n' \simeq N(r)(n') \tag{B8}
\]



to a very high degree of accuracy. That is, the conditional expected number of voids within a sample with a certain density $n'$ is the same as the expected number of voids within a sample of the same shape and volume chosen at random within a much larger volume with mean density $n'$. For very small samples, however, we would underestimate the conditional number of voids if we used the unconditional expected number corresponding to the actual number density within the sample.
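The averaging over the sample density described by eq. (B1) can be sketched numerically. The Python below is a toy illustration only: the function names are hypothetical, the relative variance of the sample density is assumed to be the clustering term $S^2$ plus a Poisson shot-noise term $1/(\bar{n}V_s)$, and the conditional count passed in by the caller is an arbitrary stand-in rather than the void statistic of this paper.

```python
import math


def relative_variance(s2, nbar, volume):
    """Relative variance of the sample number density.

    Assumption: clustering variance s2 plus Poisson shot noise 1 / (nbar * volume).
    """
    return s2 + 1.0 / (nbar * volume)


def averaged_void_count(conditional_count, nbar, rel_var, n_steps=4001, n_sigma=8.0):
    """Average conditional_count(n') over a Gaussian in n' (trapezoidal rule).

    The Gaussian has mean nbar and standard deviation nbar * sqrt(rel_var);
    the integration range is nbar +/- n_sigma standard deviations.
    """
    sigma = nbar * math.sqrt(rel_var)
    lower = nbar - n_sigma * sigma
    step = 2.0 * n_sigma * sigma / (n_steps - 1)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))

    total = 0.0
    for i in range(n_steps):
        n_prime = lower + i * step
        weight = 0.5 if i in (0, n_steps - 1) else 1.0  # trapezoid end points
        pdf = norm * math.exp(-0.5 * ((n_prime - nbar) / sigma) ** 2)
        total += weight * pdf * conditional_count(n_prime)
    return total * step
```

For a conditional count that is linear in $n'$ the average equals the count at $\bar{n}$, while for a quadratic dependence it exceeds it by exactly the relative variance; this shows how the size of $S^2$ and of the shot noise controls the difference between conditional and unconditional expectations.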

It is the conditional expected number of voids that we use to run tests for cosmological parameters, because it leads to a somewhat more restrictive test (it uses more information). Furthermore, the constraint on $\sigma_8$ coming from a test based on the values of $n'$ within the samples may be combined with that coming from tests using void statistics as if they were independent tests. It must be noted, however, that subdividing the sample into smaller samples in order to use the conditional void statistics may lead to a further small increase in the efficiency of the test, but only inasmuch as $\beta$ remains very close to 1 for the subsamples.
Radius [h^-1 Mpc]   P_0^f     P_0^sim
 2                  …         …
 3                  …         …
 4                  …         …
 5                  …         …
 6                  2.4400    2.4240 ± …
 7                  1.4400    1.4400 ± …
 8                  0.7920    0.8020 ± 0.02700
 9                  0.4123    0.4177 ± 0.02020
10                  0.2034    0.2041 ± 0.01400
11                  0.0951    0.0936 ± 0.00873
12                  0.0422    0.0403 ± 0.00500
13                  0.0178    0.0161 ± 0.00271
14                  0.0071    0.0060 ± 0.00132
15                  0.0027    0.0019 ± 0.00059

Table C2. Comparison of the VPF ($P_0$) in a (redshift-space) mock galaxy distribution obtained using our formalism (2nd column, f) and numerical simulations (3rd column, sim). The error is the rms over an ensemble of 60 mock catalogs. These results were computed for galaxies brighter than $M_r = -20.4 + 5\log h$. All the values of the VPF are given in units of 10 .



APPENDIX C: TABLES WITH THE COMPARISON BETWEEN OUR FORMALISM AND THE MILLENNIUM RUN

In this Appendix we provide the tables with the values corresponding to the figures of Section 5, which allow a more quantitative comparison of the results found using our formalism with those found in the Millennium Run. We also present numbers for other underdense statistics.
Radius [h^-1 Mpc]   n=0 (sim)         n=0 (f)   1 (sim)   1 (f)    2 (sim)   2 (f)    3 (sim)   3 (f)    4 (sim)   4 (f)
 9                  2.5600 ± 0.2300   2.5150    3.0170    3.1540   3.3180    3.5330   3.5730    3.7640   3.7530    3.9010
10                  1.1240 ± 0.1380   1.0870    1.5340    1.5630   1.8160    1.9050   2.0770    2.1730   2.2770    2.1730
11                  0.4568 ± 0.0970   0.4390    0.7228    0.7123   0.9196    0.9420   1.1040    1.1420   1.2540    1.3210
12                  0.1705 ± 0.0378   0.1654    0.3144    0.3008   0.4266    0.4281   0.5334    0.5504   0.6374    0.6681
13                  0.0561 ± 0.0095   0.0581    0.1246    0.1173   0.1884    0.1796   0.2407    0.2440   0.2971    0.3103
14                  0.0176 ± 0.0038   0.0190    0.0436    0.0462   0.0765    0.0695   0.1026    0.0998   0.1300    0.1327
15                  0.0043 ± 0.0013   0.0058    0.0137    0.0156   0.0269    0.0249   0.0409    0.0376   0.0528    0.0523

Table C1. $P_n(r)$ for dark matter halos with masses larger than $6.6 \times 10^{11}\,h^{-1}\,M_\odot$ in real space. The number density of these halos is $5 \times 10^{-3}\,(h^{-1}\,{\rm Mpc})^{-3}$. Here we present results for the range n = 0 to 4 for the Millennium Run (sim) and the corresponding predictions obtained using our formalism (f). Note that all $P_n$ values are in units of $10^{-2}$.



Radius [h^-1 Mpc]   P_1^f     P_1^sim          P_4^f     P_4^sim
 8                  7.0660    7.2540 ± 0.124   5.8540    6.0260 ± 0.042
 9                  4.2350    4.4570 ± 0.126   4.2750    4.6270 ± 0.051
10                  2.3700    2.5130 ± 0.155   2.8520    3.1870 ± 0.060
11                  1.2430    1.3220 ± 0.083   1.7600    1.9870 ± 0.059
12                  0.6137    0.6477 ± 0.054   1.0120    1.1440 ± 0.053
13                  0.2853    0.2936 ± 0.032   0.5451    0.6053 ± 0.004
14                  0.1251    0.1244 ± 0.031   0.2755    0.2963 ± 0.029
15                  0.0517    0.0500 ± 0.011   0.1307    0.1360 ± 0.017

Table C3. $P_1$ and $P_4$ for the same galaxy distribution as in Table C2. The superscripts f and sim denote results obtained using our formalism and numerical simulations, respectively. All $P_n$ are given in units of $10^{-2}$.



Radius [h^-1 Mpc]   n_v^f (halos)   n_v^sim (halos)   n_v^f (gals)   n_v^sim (gals)
 9                  6.325           6.460 ± 0.090     6.318          6.591 ± 0.084
10                  3.377           3.348 ± 0.058     3.531          3.471 ± 0.061
11                  1.795           1.906 ± 0.044     2.040          2.040 ± 0.047
12                  0.969           1.000 ± 0.032     1.093          1.120 ± 0.036
13                  0.511           0.510 ± 0.023     0.596          0.598 ± 0.026
14                  0.256           0.236 ± 0.016     0.301          0.289 ± 0.019
15                  0.113           0.103 ± 0.011     0.141          0.130 ± 0.010
16                  0.047           0.041 ± 0.007     0.059          0.051 ± 0.008

Table C4. Simulation results versus theoretical predictions for the number density of voids defined alternatively by halos and galaxies [both with number density $5 \times 10^{-3}\,(h^{-1}\,{\rm Mpc})^{-3}$] in redshift space in the Millennium Run at z = 0. Number densities are given in units of $10^{-5}\,(h^{-1}\,{\rm Mpc})^{-3}$. See text for details.
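One simple way to read Table C4 is to express the difference between formalism and simulation in units of the quoted rms error. The short Python sketch below does this for the halo columns; the variable and function names are hypothetical, and treating the quoted error as the only uncertainty is a simplifying assumption made for illustration.

```python
# (radius [h^-1 Mpc], formalism, simulation, rms error) for halos, copied from Table C4.
# Number densities are in units of 10^-5 (h^-1 Mpc)^-3.
HALO_ROWS = [
    (9, 6.325, 6.460, 0.090),
    (10, 3.377, 3.348, 0.058),
    (11, 1.795, 1.906, 0.044),
    (12, 0.969, 1.000, 0.032),
    (13, 0.511, 0.510, 0.023),
    (14, 0.256, 0.236, 0.016),
    (15, 0.113, 0.103, 0.011),
    (16, 0.047, 0.041, 0.007),
]


def residuals_in_sigma(rows):
    """Return {radius: (formalism - simulation) / error} for each table row."""
    return {radius: (formalism - simulation) / error
            for radius, formalism, simulation, error in rows}


if __name__ == "__main__":
    for radius, value in residuals_in_sigma(HALO_ROWS).items():
        print(f"r = {radius:2d}: {value:+.2f} sigma")
```

The same function can be applied to the galaxy columns, or to the rows of the other tables in this Appendix, by building the list of tuples accordingly.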



Radius [h^-1 Mpc]   n_v^f (halos)   n_v^sim (halos)   n_v^f (gals)   n_v^sim (gals)
 9                  6.200           6.700 ± 0.085     6.258          6.681 ± 0.085
10                  3.247           3.378 ± 0.060     3.298          3.429 ± 0.060
11                  1.693           1.907 ± 0.046     1.865          1.963 ± 0.045
12                  0.906           0.971 ± 0.033     0.973          0.998 ± 0.032
13                  0.464           0.465 ± 0.022     0.508          0.508 ± 0.024
14                  0.223           0.207 ± 0.015     0.238          0.232 ± 0.017
15                  0.092           0.082 ± 0.009     0.103          0.092 ± 0.011
16                  0.035           0.026 ± 0.006     0.038          0.032 ± 0.006

Table C5. The same as in Table C4 but for z = 1. Again, number densities are given in units of $10^{-5}\,(h^{-1}\,{\rm Mpc})^{-3}$.